A SYSTEM FOR THE AUTOMATIC DIFFERENTIATION
OF FORTRAN PROGRAMS
Oscar Garcia
Forest Research Institute
Rotorua, New Zealand
ABSTRACT
A general-purpose automatic
differentiation system is described.
Given a Fortran subprogram for computing a
function, it generates a subprogram that
computes partial derivatives. The APL
computer language was used in the
implementation.
INTRODUCTION
In an effort to speed up the estimation of parameters in
complex forest growth models, an automatic differentiation
system was developed (Garcia 1989, 1991). It takes as input a
Fortran program unit (usually a subroutine or function
subprogram) that computes the value of a function, and
produces as output another Fortran program unit that computes
partial derivatives with respect to specified independent
variables.
I describe here the methods and implementation, and give some
figures on the performance of the generated code. Additional
discussion and evaluation of results can be found in Garcia
(1991).
Performance appears satisfactory, and compares favorably with
that of JAKEF (Hillstrom 1990). The system may be useful in
other applications.
____________________________
Presented at the SIAM Workshop on Automatic Differentiation of
Algorithms, Breckenridge, Colorado, January 6-8, 1991.
PRINCIPLES AND IMPLEMENTATION
The basic idea was inspired by Wengert (1964). He suggested
substituting for each operator a routine that, in addition to
computing the result of the operation, would also produce the
values of the derivatives. A somewhat similar approach was
implemented by hand in order to compute first and second
partial derivatives in a parameter estimation program (Garcia
1983). Each statement in the function evaluation routine was
followed by statements to compute the derivatives of the left-
hand-side variable (usually an intermediate variable). For
example,
C = A * B
would be followed by
C1 = A1 * B + A * B1
C2 = A2 * B + A * B2
etc.,
where the Ai and Bi are partial derivatives with respect to
the i-th independent variable, computed in previous statements
(with obvious simplifications where the partials for some
terms do not exist). The function derivatives sought follow
the statement defining the final value of the function,
usually at the end of the subroutine. Unlike Wengert's, this
method produces stand-alone code, without the overhead of
calls to routines in a run-time package.
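Where a factor does not involve the independent variables, the
generated statements simplify accordingly. As a hypothetical
illustration (with K a constant), the statement
C = A * K
would be followed simply by
C1 = A1 * K
C2 = A2 * K
since the partial derivatives of K are identically zero.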
The system described here essentially automates this
procedure. The statements that compute derivatives precede,
rather than follow, each function evaluation statement, in
order to handle correctly statements such as
A = A * B .
For each assignment statement, the right-hand-side expression
is processed by a recursive descent parser, and Fortran code
for computing the relevant derivatives is generated.
Unnecessary parentheses are avoided as much as possible.
Thus, most of the redundant operations generated will be
removed by any optimizing compiler. An earlier version
produced fully parenthesized expressions that would inhibit
optimization, since in Fortran parentheses fix the order of
operations.
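To illustrate (assuming a single independent variable, with
derivatives identified by the delimiter character "_"), the
statement A = A * B above would be transformed into
A_1 = A_1 * B + A * B_1
A = A * B
Had the derivative statement followed the assignment instead,
the already updated value of A would enter the product rule,
giving an incorrect result.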
Identifiers for the derivatives are formed by appending to the
variable name an optional user-defined delimiter character and
the number of the independent variable. Identifiers exceeding
the 6-character Fortran standard maximum length might need to
be changed by the user (a warning is produced). Many Fortran
compilers, however, have the option of accepting long variable
names (e.g. using the /4Nt switch in Microsoft Fortran).
The existence of non-zero derivatives of left-hand-side
temporary variables with respect to the independent variables
is recorded in an internal data structure as the statements
are processed. Initially, new statements are generated only
for non-zero derivatives. In some instances this would not
produce correct results, and multiple passes may be needed to
initialize at zero some of the derivatives. For example, with
one pass the following statements would generate incorrect
code (X is the first independent variable and this is the
first appearance of A; derivatives of VAR with respect to the
k-th independent variable are identified as VAR_k):
IF (I .LT. 5) GO TO 10
A = 0.
GO TO 20
10 CONTINUE
A = X
20 CONTINUE
B = A + 1
would be transformed into
IF (I .LT. 5) GO TO 10
A = 0.
GO TO 20
10 CONTINUE
A_1 = 1.
A = X
20 CONTINUE
B_1 = A_1
B = A + 1
Here A_1 = 0.0 would be needed following the IF statement.
Similar problems may arise from loops:
A = 0.
B = 0.
DO 10 I = 1,N
A = A + B
B = C(I) * X
10 CONTINUE
These cases are handled by making multiple passes over the
original code. The need for another pass is signaled if a new
non-zero derivative is encountered for a variable that had
been previously used on the left-hand-side of an assignment.
Because the record of non-zero derivatives accumulates across
passes, that new derivative is then initialized at zero in the
appropriate place in the next pass.
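To illustrate on the loop example above (assuming X is the
first independent variable and the array C does not involve
it), the code produced after a second pass might look like:
A = 0.
A_1 = 0.
B = 0.
B_1 = 0.
DO 10 I = 1,N
A_1 = A_1 + B_1
A = A + B
B_1 = C(I)
B = C(I) * X
10 CONTINUE
The initialization B_1 = 0. is what the second pass adds: the
non-zero derivative of B is discovered only at the statement
B = C(I) * X, after B has already appeared on the left-hand
side.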
This myopic strategy of examining only assignment statements,
one at a time, resulted in a relatively simple and efficient
system. A few restrictions, however, must be observed in the
source coding:
a) The independent variables must be scalars or array elements
with constant subscripts.
b) Assignments involving independent variables directly or
indirectly are not allowed in IF or other control statements.
c) Labels are not allowed in assignment statements. Use
CONTINUE.
d) User defined functions involving the independent variables
are not supported.
In addition, the header of the output routine must be edited
manually to change its name and to include the derivatives as
arguments. It may also be necessary to declare the new
variables if default types or IMPLICIT declarations are not
used. The procedures were designed with Fortran 66 in mind,
and have been used only with that dialect. It is conceivable
that some additional restrictions could be needed with Fortran
77.
Second or higher derivatives can be generated simply by
running the output through the differentiator again.
Different delimiter characters for the identifiers must be
used in each run. The resulting code, however, will perform
redundant computations, since (for second derivatives) two
versions of the gradient and of the off-diagonal Hessian
elements are generated. It would not be difficult to
eliminate this redundancy in a post-processing step, but this
has not been implemented yet.
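As a hypothetical illustration, if the first run uses the
delimiter "_" and the second run "$", a second partial
derivative of a variable F with respect to the first and
second independent variables would appear as
F_1$2
and the same quantity would also be computed as F_2$1, which
is the redundancy just mentioned.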
The system was written in APL, using STSC's APL*PLUS
interpreter for IBM PC or compatible microcomputers (STSC
1989). This is transparent to the user, however, and no
knowledge of APL is required to use it. The high level and
power of the APL computer language made it possible to
complete the implementation in a very short time.
The system also runs without modification with the inexpensive
Pocket APL (Turner 1985), and with other APL interpreters from
STSC available for several microcomputers and mainframes. In
order to make it more accessible, a version for the public
domain I-APL interpreter has been prepared. Although all
comments were removed to make it fit into the less than 30 K
bytes of memory available under I-APL, this version cannot
process very large problems, and is about 27 times slower than
the APL*PLUS version.
PERFORMANCE
The run-time performance of the generated code may be assessed
by the work ratio (Griewank 1988), the ratio of the time
required to compute a gradient to the time taken by a function
evaluation. Note that the computation of the gradient also
produces a function value, which in many applications is
needed as well. A forward differences approximation of the
gradient for a function with n independent variables requires
n + 1 function evaluations, that is, a work ratio of n + 1.
The work ratio for central differences is 2n, or 2n + 1 if a
function evaluation at the central point is included.
The following table shows work ratios for several examples.
Work ratios for another automatic differentiator, JAKEF
(Hillstrom 1990), are also included. The tests were carried
out on AT-compatible microcomputers with floating point
coprocessors.
Problem   Variables   Work ratio   Work ratio (JAKEF)
   1           3          3.1            11.2
   2           4          2.7            11.1
   3           9          4.0              -
   4          16          5.6              -
   5          18          5.2             8.0
   6          18          6.4              -
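As a worked example, for Problem 5 (n = 18) the divided
difference approximations would require work ratios of
n + 1 = 19 (forward differences)
2n = 36 (central differences)
compared with the measured 5.2 for the automatically
generated code.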
Problems 1 and 2 are two short subroutines used for testing
optimization algorithms. Problems 3 to 6 are large
subroutines used for parameter estimation, with about 130 to
180 Fortran statements, and containing loops running over
large numbers of observations. JAKEF uses a reverse
differentiation approach (Griewank 1988), with storage
requirements increasing with the number of statements
executed. Only problem 5, where the number of observations
was limited to 100, could be solved with JAKEF in the
available memory (Garcia 1991).
REFERENCES
Garcia, O. 1983. A stochastic differential equation model for
the height growth of forest stands. Biometrics 39, 1059-
1072.
Garcia, O. 1989. Growth Modelling - New developments. In
Nagumo, H. and Konohira, Y. (Ed.) Japan and New Zealand
Symposium on Forestry Management Planning. Japan
Association for Forestry Statistics.
Garcia, O. 1991. Experience with automatic differentiation in
estimating forest growth model parameters. Presented at
the SIAM Workshop on Automatic Differentiation of
Algorithms, Breckenridge, Colorado, January 6-8, 1991.
Griewank, A. 1988. On automatic differentiation. Argonne
National Laboratory, Mathematics and Computer Science
Division. Preprint ANL/MCS-P10-1088.
Hillstrom, K. E. 1990. User guide for JAKEF. (Available in
electronic form from netlib on the Internet).
STSC 1989. APL*PLUS system for the PC - User's manual. STSC,
Inc.
Turner, J. R. 1985. Pocket APL - Reference guide. STSC,
Inc., Rockville, Maryland.
Wengert, R.E. 1964. A simple automatic derivative evaluation
program. Communications of the ACM 7, 463-471.
APPENDIX
FORWARD AND REVERSE DIFFERENTIATION
(Omitted)